
26 research outputs found


Co-complex protein membership evaluation using Maximum Entropy on GO ontology and InterPro annotation.

Author: Armean Irina M
Holden Sean B
Lilley Kathryn S
Pilkington Nicholas CV
Trotter Matthew WB
Publication venue: Bioinformatics
Publication date: 30/01/2018

MOTIVATION: Protein-protein interactions (PPI) play a crucial role in our understanding of protein function and biological processes. Standardized records of experimental findings are increasingly stored in ontologies, with the Gene Ontology (GO) being one of the most successful projects. Several PPI evaluation algorithms have been based on the application of probabilistic frameworks or machine learning algorithms to GO properties. Here, we introduce a new training set design and machine learning based approach that combines dependent heterogeneous protein annotations from the entire ontology to evaluate putative co-complex protein interactions determined by empirical studies. RESULTS: PPI annotations are built combinatorically using corresponding GO terms and InterPro annotation. We use an S. cerevisiae high-confidence complex dataset as a positive training set. A series of classifiers based on Maximum Entropy and support vector machines (SVMs), each with a composite counterpart algorithm, are trained on a series of training sets. These achieve an area under the ROC curve of up to 0.97, outperforming go2ppi, a previously established prediction tool for PPI based on GO annotations. AVAILABILITY AND IMPLEMENTATION: https://github.com/ima23/maxent-ppi. CONTACT: [email protected]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Apollo (Cambridge)

In vivo analysis of proteomes and interactomes using Parallel Affinity Capture (iPAC) coupled to mass spectrometry.

Author: Armean Irina M
Drummond Emma
Johnson Glynnis
Lilley Kathryn S
Lowe Nick
Rees Johanna S
Roote John
Russell Steven
Ryder Edward
Spriggs Helen
St Johnston Daniel
Publication venue: Mol Cell Proteomics
Publication date: 29/03/2011

Affinity purification coupled to mass spectrometry provides a reliable method for identifying proteins and their binding partners. In this study we have used Drosophila melanogaster proteins triple tagged with Flag, Strep II, and Yellow fluorescent protein in vivo within affinity pull-down experiments and isolated these proteins in their native complexes from embryos. We describe a pipeline for determining interactomes by Parallel Affinity Capture (iPAC) and show its use by identifying partners of several protein baits with a range of sizes and subcellular locations. This purification protocol employs the different tags in parallel and involves detailed comparison of the resulting mass spectrometry data sets, ensuring that the interaction lists obtained are of high confidence. We show that this approach identifies known interactors of bait proteins as well as novel interaction partners by comparing the data obtained with published interaction data sets. The high confidence in vivo protein data sets presented here add new data to the currently incomplete D. melanogaster interactome. Additionally, we report contaminant proteins that persist in affinity purifications irrespective of the tagged bait. This project is funded by the Wellcome Trust. This is the final version of the article. It was first available from ASBMB via http://dx.doi.org/10.1074/mcp.M110.00238

PubMed Central

Apollo (Cambridge)

Analysis of the expression patterns, subcellular localisations and interaction partners of Drosophila proteins using a pigP protein trap library.

Author: Armean Irina M
Armstrong J Douglas
Bastock Rebecca
Drummond Emma
Drummond Jenny
Hansen Celia
Huelsmann Sven
Johnson Glynnis
Knowles-Barley Seymour
Landgraf Matthias
Lilley Kathryn S
Lowe Nick
Magbanua Jose P
Naylor Huw
Phillips Roger G
Rees Johanna S
Roote John
Russell Steven
Ryder Ed
Sanson Bénédicte
Spriggs Helen
St Johnston Daniel
Trovisco Vitor
UK Drosophila Protein Trap Screening Consortium
White-Cooper Helen
Publication venue: 'The Company of Biologists'
Publication date: 01/01/2014

Although we now have a wealth of information on the transcription patterns of all the genes in the Drosophila genome, much less is known about the properties of the encoded proteins. To provide information on the expression patterns and subcellular localisations of many proteins in parallel, we have performed a large-scale protein trap screen using a hybrid piggyBac vector carrying an artificial exon encoding yellow fluorescent protein (YFP) and protein affinity tags. From screening 41 million embryos, we recovered 616 verified independent YFP-positive lines representing protein traps in 374 genes, two-thirds of which had not been tagged in previous P element protein trap screens. Over 20 different research groups then characterized the expression patterns of the tagged proteins in a variety of tissues and at several developmental stages. In parallel, we purified many of the tagged proteins from embryos using the affinity tags and identified co-purifying proteins by mass spectrometry. The fly stocks are publicly available through the Kyoto Drosophila Genetics Resource Center. All our data are available via an open access database (Flannotator), which provides comprehensive information on the expression patterns, subcellular localisations and in vivo interaction partners of the trapped proteins. Our resource substantially increases the number of available protein traps in Drosophila and identifies new markers for cellular organelles and structures. This work was supported by a project grant from the Wellcome Trust [076739], by a Wellcome Trust Principal Research Fellowship to D.StJ. [049818 and 080007], and by core support from the Wellcome Trust [092096] and Cancer Research UK [A14492]. This is the final version of the article. It was first available from The Company of Biologists via http://dx.doi.org/10.1242/dev.11105

Online Research @ Cardiff

PubMed Central

Edinburgh Research Explorer

Apollo (Cambridge)

King's Research Portal

Leicester Research Archive

The effect of LRRK2 loss-of-function variants in humans

Author: 23andMe Research Team
Alföldi Jessica
Alipanahi Babak
Armean Irina M.
Banks Eric
Baptista Marco A.S.
Bergelson Louis
Cibulskis Kristian
Cole Joanne B.
Collins Ryan L.
Connolly Kristen M.
Covarrubias Miguel
Cummings Beryl
Daly Mark J.
Donnelly Stacey
Farjoun Yossi
Ferriera Steven
Francioli Laurent
Gabriel Stacey
Gauthier Laura D.
Genome Aggregation Database Consortium
Genome Aggregation Database Production Team
Gentry Jeff
Goodrich Julia K.
Guan Anna
Gupta Namrata
Jeandet Thibault
Kaplan Diane
Karczewski Konrad J.
Kleinman Aaron
Laricchia Kristen M.
Lehtimäki Terho
Llanwarne Christopher
Marshall Jamie L.
Mattila Kari M.
Merchant Kalpana M.
Minikel Eric V.
Morrison Peter
Munshi Ruchi
Neale Benjamin M.
Novod Sam
O’Donnell-Luria Anne H.
Petrillo Nikelle
Quaife Nicholas M.
Suvisaari Jaana
Wang Qingbo
Whiffin Nicola
Publication date: 01/01/2020

Analysis of large genomic datasets, including gnomAD, reveals that partial LRRK2 loss of function is not strongly associated with diseases, serving as an example of how human genetics can be leveraged for target validation in drug discovery. Human genetic variants predicted to cause loss-of-function of protein-coding genes (pLoF variants) provide natural in vivo models of human gene inactivation and can be valuable indicators of gene function and the potential toxicity of therapeutic inhibitors targeting these genes(1,2). Gain-of-kinase-function variants in LRRK2 are known to significantly increase the risk of Parkinson's disease(3,4), suggesting that inhibition of LRRK2 kinase activity is a promising therapeutic strategy. While preclinical studies in model organisms have raised some on-target toxicity concerns(5-8), the biological consequences of LRRK2 inhibition have not been well characterized in humans. Here, we systematically analyze pLoF variants in LRRK2 observed across 141,456 individuals sequenced in the Genome Aggregation Database (gnomAD)(9), 49,960 exome-sequenced individuals from the UK Biobank and over 4 million participants in the 23andMe genotyped dataset. After stringent variant curation, we identify 1,455 individuals with high-confidence pLoF variants in LRRK2. Experimental validation of three variants, combined with previous work(10), confirmed reduced protein levels in 82.5% of our cohort. We show that heterozygous pLoF variants in LRRK2 reduce LRRK2 protein levels but that these are not strongly associated with any specific phenotype or disease state. Our results demonstrate the value of large-scale genomic databases and phenotyping of human loss-of-function carriers for target validation in drug discovery. Peer reviewed.

Lund University Publications

Julkari

Spiral - Imperial College Digital Repository

Helsingin yliopiston digitaalinen arkisto

University of Dundee Online Publications

Trepo - Institutional Repository of Tampere University

The ELIXIR Human Copy Number Variations Community: building bioinformatics infrastructure for research

Copy number variations (CNVs) are major causative contributors both in the genesis of genetic diseases and human neoplasias. While 'High-Throughput' sequencing technologies are increasingly becoming the primary choice for genomic screening analysis, their ability to efficiently detect CNVs is still heterogeneous and remains to be developed. The aim of this white paper is to provide a guiding framework for the future contributions of ELIXIR's recently established human CNV Community, with implications beyond human disease diagnostics and population genomics. This white paper is the direct result of a strategy meeting that took place in September 2018 in Hinxton (UK) and involved representatives of 11 ELIXIR Nodes. The meeting led to the definition of priority objectives and tasks, to address a wide range of CNV-related challenges ranging from detection and interpretation to sharing and training. Here, we provide suggestions on how to align these tasks within the ELIXIR Platforms strategy, and on how to frame the activities of this new ELIXIR Community in the international context.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Directory of Open Access Journals

Dissertations of the University of Groningen

Ensembl Genomes 2016: more genomes, more complexity

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces

Crossref

Cold Spring Harbor Laboratory Institutional Repository

PubMed Central

Edinburgh Research Explorer

Human knockouts and phenotypic analysis in a cohort with a high rate of consanguinity

Author: A Casu
A Manichaikul
A McKenna
AB Jørgensen
AH Bittles
AL Price
Anis Memon
Anne H. O’Donnell-Luria
Asif Rasheed
Atif Imran
B Georgi
BA Carr
Benjamin Weisburd
D Christiansen
D Eisenberg
D Gaudet
D Saleheen
Daniel G. MacArthur
Daniel J. Rader
Danish Saleheen
E Di Angelantonio
EM Scott
Eric S. Lander
ES Lander
Faisal Majeed
Fazal-ur-Rehman Memon
G Jun
GX Wang
H Hunter-Zinck
H Li
HD White
Hong-Hee Won
IA Murray
Irina M. Armean
J Crosby
JA Tennessen
John Danesh
JS Kooner
JT Eppig
K Dahl
Kaitlin E. Samocha
KE Samocha
Kevin Trindade
Khalid Mahmood
Khan Shah Zaman
Konrad J. Karczewski
L Gold
LM Polfus
M Fuchs
M Lek
M Maraki
MA DePristo
Madiha Ishaq
Maria Samuel
Mark J. Daly
Megan Mucksavage
MJ Graham
ML O’Donoghue
Mohammad Ishaq
Mozzam Zaidi
MR Schneider
MW Huff
Nadeem Hayyat Mallick
Nadeem Qamar
Namrata Gupta
Naveeduddin Ahmed
P Sulem
Philippe Frossard
Pradeep Natarajan
R Do
R Murtazina
RD Mosteller
Ron Do
Ronald M. Krauss
S De Rubeis
S Fisher
S Purcell
S Wright
Saba Akhtar
SD Brown
Sekar Kathiresan
Shahid Abbas
SM Purcell
Stacey Gabriel
Sumeet A. Khetarpal
Syed Nadeem Hasan Rizvi
Syed Zahed Rasheed
T Wang
Tahir Saghir
TG Fazzio
TI Pollin
TJ Standiford
VM Narasimhan
W McLaren
Wei Zhao
Z Karim
Zia Yaqoob
Publication venue: Nature
Publication date: 01/04/2017

A major goal of biomedicine is to understand the function of every gene in the human genome. Loss-of-function mutations can disrupt both copies of a given gene in humans and phenotypic analysis of such 'human knockouts' can provide insight into gene function. Consanguineous unions are more likely to result in offspring carrying homozygous loss-of-function mutations. In Pakistan, consanguinity rates are notably high. Here we sequence the protein-coding regions of 10,503 adult participants in the Pakistan Risk of Myocardial Infarction Study (PROMIS), designed to understand the determinants of cardiometabolic diseases in individuals from South Asia. We identified individuals carrying homozygous predicted loss-of-function (pLoF) mutations, and performed phenotypic analysis involving more than 200 biochemical and disease traits. We enumerated 49,138 rare (<1% minor allele frequency) pLoF mutations. These pLoF mutations are estimated to knock out 1,317 genes, each in at least one participant. Homozygosity for pLoF mutations at PLA2G7 was associated with absent enzymatic activity of soluble lipoprotein-associated phospholipase A2; at CYP2F1, with higher plasma interleukin-8 concentrations; at TREH, with lower concentrations of apoB-containing lipoprotein subfractions; at either A3GALT2 or NRG4, with markedly reduced plasma insulin C-peptide concentrations; and at SLC9A3R1, with mediators of calcium and phosphate signalling. Heterozygous deficiency of APOC3 has been shown to protect against coronary heart disease; we identified APOC3 homozygous pLoF carriers in our cohort. We recruited these human knockouts and challenged them with an oral fat load. Compared with family members lacking the mutation, individuals with APOC3 knocked out displayed marked blunting of the usual post-prandial rise in plasma triglycerides. 
Overall, these observations provide a roadmap for a 'human knockout project', a systematic effort to understand the phenotypic consequences of complete disruption of genes in humans. D.S. is supported by grants from the National Institutes of Health, the Fogarty International, the Wellcome Trust, the British Heart Foundation, and Pfizer. P.N. is supported by the John S. LaDue Memorial Fellowship in Cardiology from Harvard Medical School. H.-H.W. is supported by a grant from the Samsung Medical Center, Korea (SMO116163). S.K. is supported by the Ofer and Shelly Nemirovsky MGH Research Scholar Award and by grants from the National Institutes of Health (R01HL107816), the Donovan Family Foundation, and Fondation Leducq. Exome sequencing was supported by a grant from the NHGRI (5U54HG003067-11) to S.G. and E.S.L. D.G.M. is supported by a grant from the National Institutes of Health (R01GM104371). J.D. holds a British Heart Foundation Chair, European Research Council Senior Investigator Award, and NIHR Senior Investigator Award. The Cardiovascular Epidemiology Unit at the University of Cambridge, which supported the field work and genotyping of PROMIS, is funded by the UK Medical Research Council, British Heart Foundation, and NIHR Cambridge Biomedical Research Centre ... Fieldwork in the PROMIS study has been supported through funds available to investigators at the Center for Non-Communicable Diseases, Pakistan and the University of Cambridge, UK

Crossref

Harvard University - DASH

eScholarship - University of California

Apollo (Cambridge)

The ELIXIR Human Copy Number Variations Community: building bioinformatics infrastructure for research

Author: Armean Irina M
Baudis Michael
et al
Salgado David
Publication venue: 'F1000 Research Ltd'
Publication date: 13/10/2020

Copy number variations (CNVs) are major causative contributors both in the genesis of genetic diseases and human neoplasias. While “High-Throughput” sequencing technologies are increasingly becoming the primary choice for genomic screening analysis, their ability to efficiently detect CNVs is still heterogeneous and remains to be developed. The aim of this white paper is to provide a guiding framework for the future contributions of ELIXIR’s recently established human CNV Community, with implications beyond human disease diagnostics and population genomics. This white paper is the direct result of a strategy meeting that took place in September 2018 in Hinxton (UK) and involved representatives of 11 ELIXIR Nodes. The meeting led to the definition of priority objectives and tasks, to address a wide range of CNV-related challenges ranging from detection and interpretation to sharing and training. Here, we provide suggestions on how to align these tasks within the ELIXIR Platforms strategy, and on how to frame the activities of this new ELIXIR Community in the international context. Keywords: Copy Number Variation, Data analysis, next-generation sequencing, whole genome sequencing, Human Genetics, Oncogenetics, Common Diseases, Federated Human Data

ZORA

The genome of the biting midge Culicoides sonorensis and gene expression analyses of vector competence for bluetongue virus

Author: Armean Irina M
Campbell Lahcen
Carpenter Simon
Fife Mark
Gonzalez-Uriarte Asier
Harrup Lara E
Hinsley Malcolm
Kersey Paul
Morales-Hojas Ramiro
Nayduch Dana
Saski Christopher
Silk Rhiannon
Tabachnick Walter J
Veronesi Eva
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

BACKGROUND: The new genomic technologies have provided novel insights into the genetics of interactions between vectors, viruses and hosts, which are leading to advances in the control of arboviruses of medical importance. However, the development of tools and resources available for vectors of non-zoonotic arboviruses remains neglected. Biting midges of the genus Culicoides transmit some of the most important arboviruses of wildlife and livestock worldwide, with a global impact on economic productivity, health and welfare. The absence of a suitable reference genome has hindered genomic analyses to date in this important genus of vectors. In the present study, the genome of Culicoides sonorensis, a vector of bluetongue virus (BTV) in the USA, has been sequenced to provide the first reference genome for these vectors. In this study, we also report the use of the reference genome to perform initial transcriptomic analyses of vector competence for BTV. RESULTS: Our analyses reveal that the genome is 189 Mb, assembled in 7974 scaffolds. Its annotation using the transcriptomic data generated in this study and in a previous study has identified 15,612 genes. Gene expression analyses of C. sonorensis females infected with BTV performed in this study revealed 165 genes that were differentially expressed between vector competent and refractory females. Two candidate genes, glutathione S-transferase (gst) and the antiviral helicase ski2, previously recognized as involved in vector competence for BTV in C. sonorensis (gst) and repressing dsRNA virus propagation (ski2), were confirmed in this study. CONCLUSIONS: The reference genome of C. sonorensis has enabled preliminary analyses of the gene expression profiles of vector competent and refractory individuals. The genome and transcriptomes generated in this study provide suitable tools for future research on arbovirus transmission. 
These provide a valuable resource for this vector lineage, which diverged from other major Dipteran vector families over 200 million years ago. The genome will be a valuable source of comparative data for other important Dipteran vector families including mosquitoes (Culicidae) and sandflies (Psychodidae), and together with the transcriptomic data can yield potential targets for transgenic modification in vector control and functional studies.

ZORA

Comparative evolutionary analyses of eight whitefly Bemisia tabaci sensu lato genomes: cryptic species, agricultural pests and plant-virus vectors

The genomes, transcriptomes, and predicted protein-coding sequences are available from Ensembl Metazoa (http://metazoa.ensembl.org) and are included within the references. Raw RNA-Seq datasets generated and/or analyzed during the current study are available from the European Nucleotide Archive database repository (https://www.ebi.ac.uk/ena) under the parent project accessions: PRJEB28507, PRJEB36965, PRJEB35304, PRJEB39408. All data generated during the analyses of these datasets are included in this published article, supplementary information files, and figshare repository (https://doi.org/10.6084/m9.figshare.23666799; https://doi.org/10.6084/m9.figshare.23666832.v4; https://doi.org/10.6084/m9.figshare.23666844). Background: The group of >40 cryptic whitefly species called Bemisia tabaci sensu lato are amongst the world's worst agricultural pests and plant-virus vectors. Outbreaks of B. tabaci s.l. and the associated plant-virus diseases continue to contribute to global food insecurity and social instability, particularly in sub-Saharan Africa and Asia. Published B. tabaci s.l. genomes have limited use for studying African cassava B. tabaci SSA1 species, due to the high genetic divergences between them. Genomic annotations presented here were performed using the 'Ensembl gene annotation system', to ensure that comparative analyses and conclusions reflect biological differences, as opposed to arising from different methodologies underpinning transcript model identification. Results: We present here six new B. tabaci s.l. genomes from Africa and Asia, and two re-annotated previously published genomes, to provide evolutionary insights into these globally distributed pests. Genome sizes ranged between 616-658 Mb and exhibited some of the highest coverage of transposable elements reported within Arthropoda. Far fewer total protein-coding genes (PCG) were recovered compared to the previously published B. tabaci s.l. genomes, and structural annotations generated via the uniform methodology strongly supported a repertoire of between 12.8-13.2 × 10^3 PCG. An integrative systematics approach incorporating phylogenomic analysis of nuclear and mitochondrial markers supported a monophyletic Aleyrodidae and the basal positioning of B. tabaci Uganda-1 to the sub-Saharan group of species. Reciprocal cross-mating data and the co-cladogenesis pattern of the primary obligate endosymbiont 'Candidatus Portiera aleyrodidarum' from 11 Bemisia genomes further supported the phylogenetic reconstruction to show that African cassava B. tabaci populations consist of just three biological species. We include comparative analyses of gene families related to detoxification, sugar metabolism and vector competency, and evaluate the presence and function of horizontally transferred genes, essential for understanding the evolution and unique biology of constituent B. tabaci s.l. species. Conclusions: These genomic resources have provided new and critical insights into the genetics underlying B. tabaci s.l. biology. They also provide a rich foundation for post-genomic research, including the selection of candidate gene-targets for innovative whitefly and virus-control strategies.

INRIA a CCSD electronic archive server

HAL-IRD

HAL-CIRAD